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Analysis  of  a cooperative  stereo  algorithm 
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D.  Marr,  O.  Palm**  and  T.  Poggio** 


i 

SUMMARY'.  Marr  6 Poggio  (1976)  recently  described  a cooperative 
algorithm  that  solves  the  correspondence  problem  for  stereopsis.  This 
article  uses  a probabilistic  technique  to  analyze  the  convergence  of 
that,  algorithm,  and  derives  the  conditions  governing  the  stability  of 
the  solution  state.  The  actual  results  of  applying  the  algorithm  to 
random-dot  stereograms  are  compared  with  the  probabilistic  analysis.  A 
satisfactory  mathematical  analysis  of  the  asymptotic  behaviour  of  the 
algorithm  is  possible  for  a suitable  choice  of  the  parameter  values  and 
loading  rules,  and  again  the  actual  performance  of  the  algorithm  under 
these  conditions  is  compared  with  the  theoretical  predictions. 

Finally,  some  problems  raised  by  the  analysis  of  this  type  of 
"cooperative"  algorithm  are  briefly  discussed. 
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Spcmannstrasse  38,  Germany. 
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1 Introduction 

The  extraction  of  stereo-disparity  Information  from  two  images 
depends  upon  establishing  a correspondence  between  them.  In  a recent 
article,  Marr  6 Poggio  (1976)  analyzed  the  nature  of  the  correspondence 
computation  and  derived  a cooperative  algorithm  that  Implements  it. 
Although  several  examples  were  given  of  the  performance  of  the 
algorithm  on  random-dot  stereograms  (Marr  6 Poggio  1976,  figures  3-6), 
space  did  not  permit  a thorough  analysis  of  the  fixed  points  of  the 
algorithm,  or  of  its  convergence.  In  this  article,  we  shall  examine 
these  Issues  in  detail. 

1.1  Computational  structure  of  the  correspondence  problem 

Marr  6 Poggio  (1976)  argued  that  the  stereo  problem  may  be  reduced 
to  that  of  matching  two  primitive  descriptions,  one  from  each  eye. 

They  showed  that  the  central  problem  is  to  find  a correspondence 
between  the  left  and  right  descriptions,  that  satisfies  the  two  rules 
(p.  284  and  Marr,  1974): 

(Rl)  Uniqueness : Each  item  from  each  image  may  be  assigned  at  most  one 
disparity. 

<R2)  Continuity,  Disparity  varies  smoothly  almost  everywhere. 

By  constructing  an  explicit  geometrical  representation  of  these  two 
rules  (figure  lc),  they  were  able  to  derive  a cooperative  algorithm 


1.  Figure  la  shows  the  explicit  structure  of  the  two  rules  Ai  and  A2 
for  the  case  of  a one-dleenslonal  laage,  and  It  also  represents  the 
structure  of  a network  for  lapleaentlng  the  algorltha  described  by 
equation  1.1.1.  Solid  lines  represent  "inhibitory"  interactions,  and 
dotted  lines  represent  "excitatory"  ones,  lb  gives  the  local  structure 
at  each  node  of  the  network  la.  This  algorltha  may  be  extended  to  two- 
diaenslonal  laages,  In  which  case  each  node  in  the  corresponding 
network  has  the  local  structure  shown  In  lc.  (Marr  f Pogglo  1976  figure 
2). 
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that  implements  them.  If  one  thinks  of  figure  la  as  a network,  with  a 
cell  at  each  node,  the  uniqueness  rule  HI  means  that  only  one  cell  is 
"on"  along  each  vertical  or  horizontal  line  (the  line  of  sight  from  the 
left  and  right  eyes);  and  the  continuity  rule  H2  implies  that  solutions 
(its  asymptotic  states)  tend  to  spread  along  the  dotted  diagonals 
(lines  of  constant  disparity). 

In  order  to  implement  these  rules,  each  cell  sends  "inhibitory" 
connections  to  all  other  cells  along  the  same  vertical  and  horizontal 
lines,  and  excitatory  connections  along  its  diagonal.  This  gives  the 
local  network  geometry  shown  in  figure  ib.  For  a two-dimensional 
image,  the  only  change  needed  is  to  make  the  excitatory  neighborhood 
two-dimensional,  which  gives  the  local  geometry  shown  in  figure  lc. 

Let  Cx.(/id  dcnote  the  state  at  time  t of  the  cell  corresponding  to 
coordinate  (x,  y)  on  the  left  retina,  matching  position  (x«-d,  y)  on  the 
right  retina.  Let  S(xyd)  denote  its  excitatory  neighborhood  (the  disc 
in  figure  lc),  and  O(xyd)  Its  inhibitory  neighborhood  (the  horizontal 
and  vertical  lines  in  figure  lc).  The  algorithm  Implemented  by  the 
network  may  be  written  (Marr  6 Poggio  1976,  equation  2) 


1.1.1 


A t+i) 
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where  visa  threshold  function  that  takes  values  0 or  1,  and  < is  an 
"inhibition"  constant. 

This  article  is  concerned  with  the  properties  of  the  algorithms 
defined  by  equation  1.1.1  or,  equivalently,  with  the  behavior  of  the 
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corresponding  network  (fig.  1).  The  two  inputs  to  the  algorithm  or 
network,  from  which  the  initial  state  of  the  network  is  determined,  are 
usually  two  matrices  whose  entries  consist  of  0's  and  l's.  The  second 
matrix  is  constructed  from  the  first  by  x-translations  of  regions  of 
it.  As  we  shall  discuss  later  the  algorithm  defined  by  equation  1.1.1 
has  some  analogies  with  games  like  "life". 

The  plan  of  the  paper  is  as  follows:  Section  2 describes  the 
loading  rules,  which  determine  the  initial  state  from  the  input 
stereograms,  and  also  defines  the  algorithm  precisely.  The  relations 
between  the  fixed  points  of  the  algorithm  and  the  states  that  satisfy 
the  two  conditions  R1  and  R2  are  then  discussed  (section  3).  A 
probabilistic  approach  to  the  convergence  of  the  algorithm  is  outlined 
in  section  4.  Actual  computer  simulations  of  the  algorithm  are 
compared  with  the  probabilistic  analysis,  and  the  range  of  parameter 
values  that  yield  a "nice"  convergence  is  discussed.  Some  special 
situations  are  also  analyzed  (section  5).  A suitable  (and  restrictive) 
choice  of  the  parameter  values  in  eq.  1.1.1  allows  a satisfactory 
mathematical  analysis  of  the  algorithm:  section  6 is  devoted  to  such 
an  approach.  Finally,  we  briefly  discuss  the  mathematical  problems 
raised  by  the  analysis  of  this  type  of  "cooperative"  algorithm. 
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2 The  algorithm 


2.1  Loading  conditions 

Let  the  positions  on  the  left  and  right  retinas  be  denoted  by  Lx  u 
and  fiK  v respectively.  These  arrays  take  the  values  0,  indicating  the 
absence  of  a feature,  or  1,  indicating  the  presence.  The  initial 
condition  of  the  network,  for  stereogram  L,  R is  given  by 

2.1.t  cx,ytd  ' Lx,u  ‘ *x+d,y 

within  the  appropriate  range  d of  disparity. 

2.2  The  algorithm 

The  relation  between  states  at  times  t and  t*l,  is  given  by  the 
recurrence  relation  eq  1.1.1,  where  a is  a sigmoid  function  in  general, 
and  here  is  taken  to  be  the  threshold  function 
2.2.1  a(u)  * 1 if  u * #, 

■r  0 otherwise. 

« is  a constant,  known  as  the  "inhibition  constant".  The  number  of 
disparity  layers  d we  shall  denote  by  0,  and  we  shall  let  N be  the 
diameter  of  the  excitatory  neighborhood  S(x.y.d).  In  the  example  shown 
in  figure  2,  n * 5,  and  the  total  number  of  cells  in  an  excitatory 
neighborhood  is  13.  The  number  less  the  cell  Itself  is  12,  which  we 
shall  denote  by  f.  The  number  of  cells  in  an  inhibitory  neighborhood 


2.  The  excitatory  neighborhood  (figure  lc)  used  In  our  lapleaentatlon 
has  a dlaaeter  of  5,  and  contains  13  cells.  The  central  cell,  marked 
by  a square,  receives  at  aost  12  excitatory  Inputs  froa  Its  neighbours. 
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of  a given  cell  is  20  - 2,  excluding  the  cell  itself. 

2.3  Parameter  values  and  some  facts 

The  parameter  values  chosen  for  our  original  algorithm1  (Marr  6 
Poggio,  1970)  were  £ « 12,  0 - 7,  • * 2,  0 * 4,  with  the  excitatory 
neighborhood  shown  in  fig.  2.  Among  other  constraints,  these  parameter 
values  were  chosen  to  satisfy  the  following  conditions: 

2.3.1  in  the  absence  of  inhibition  and  of  a contribution  from  the  term 
C°,  straight  line  borders  should  fill  in  as  shown  in  fig.  3a.  This  is 
true  when  0 s 4. 

2.3.2  straight  line  borders  between  two  "filled-in"  planes  at 
different  disparities  should  not  grow.  This  requires  that  4 - 2<  < 0. 

2.3.3  with  the  particular  values  chosen: 

--  a pattern  of  five  connected  points  is  the  smallest  configuration 
that  can  survive  (see  fig.  3b).  It  will  not  grow  unless  one  other 
point  is  added  (e.  g.  at  P in  fig.  3b). 

--  the  sharpest  convexity  capable  of  surviving  against  one  inhibition, 
with  the  help  of  a contribution  from  C°  is  a right-angle.  Fig.  3c 
shows  that  the  condition  is  6 - « * #. 

--  a convex  or  flat  border  cannot  grow  against  one  inhibition;  it  can 
grow  only  into  scattered  active  cells. 

--  the  least  concave  patterns  capable  of  growing  under  two  inhibitions 
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3.  The  total  excitatory  contribution  for  various  configurations  of 
"on"  cells.  The  excitatory  neighborhood  (figure  2)  is  shown  with  open 
circles,  except  for  the  central  cell  which  is  indicated  by  a square 
because  it  Bakes  no  contribution  to  the  total  excitation.  With  a 
threshold  of  4.0;  3a  shows  that  a flat  border  will  grow  in  the  absence 
of  inhibition,  3b  exhibits  the  smallest  stable  configuration,  3c  the 
sharpest  stable  convexity,  and  3d  i e show  concavities  that  fill  in. 
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are  shown  in  figs.  3d  6 e.  They  fill  in  by  one  or  two  cells  and  then 
are  no  longer  cone,  ve  enough  to  grow  under  two  inhibitions. 

3 Invariant  states  and  the  matching  rules 

The  matching  rules  for  stereopsis  that  were  given  in  the 
Introduction  take  the  following  fore  for  the  algorithe  discussed  here: 

(1)  Uniqueness-.  Each  item  from  each  image  may  be  assigned  at  most  one 
disparity  value. 

(2)  Continuity:  Disparity  does  not  change  almost  everywhere. 

Comment'.  R2  has  now  taken  a slightly  different  form.  This  is  because 
disparity  takes  only  discrete  value  in  this  algorithm.  Images 
containing  smoothly  varying  disparities  may  be  handled  by  a modified 
version  of  the  algorithm,  which  will  be  discussed  in  section  5. 

We  now  show  that  the  states  in  which  these  two  rules  are  obeyed 
are  for  all  practical  purposes  invariant,  i.e.  they  are  fixed  points  of 
eq.  l.l.l,  and  once  achieved,  do  not  change  in  subsequent  iterations. 

3.1  Configurations  that  satisfy  the  matching  rules  are  invariant 

The  continuity  and  uniqueness  conditions  mean  that,  for  each  value 


4.  The  solid  lines  indicate  solution  planes  (c / figure  la).  Lines  of 
sight  PQr,  PQl  Intersect  solution  planes  at  only  one  point  P,  except 
possibly  near  the  (rare)  disparity  boundaries  like  A.  Thus 
configurations  that  obey  rule  A1  are  invariant. 
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of  y,  a cross-section  of  the  network  has  the  appearance  shown  in  figure 
4 (the  continuity  condition  also  requires  that  the  active  segment  has 
some  extension  in  the  g direction).  That  is,  the  "on"  cells  in  the 
network  form  extended  segments  like  that  shown  as  AB  (continuity),  and 
most  lines  of  sight  (e.p.  PQL,  PQR)  Intersect  only  one  of  these 
extended  segments  (uniqueness).  Some  lines  of  sight  (e.g.  to  D)  may 
Intersect  two  planes:  this  occurs  only  at  the  (rare)  boundaries  at 
which  disparity  changes.  The  physical  situation  is  that  one  surface  is 
obscuring  the  other. 

We  show  now  that  these  configurations  are  invariant  if  the 
parameter  values  are  appropriate. 


(i)  Interior  points  like  P are  certainly  invariant  if 

3.1.1  £ i # 

x',y',d'  t S(x,y,d) 

if  P is  Interior  In  both  x and  g. 

(11)  Eq.  3.1.1  implies  that  boundary  points  like  A (fig.  4)  on  a 
straight  boundary  (in  the  x-g  plane)  will  not  grow  Into  the  Interior  of 
an  existing  segment  at  another  disparity  provided  that 

3.1.2  E/2  * 1 - 2*  < 9 

Concave  pieces  of  boundaries  can  In  principle  grow,  but  not  much  for 


Lx 


S.  The  two  possible  stable  edges  for  flat  boundaries.  Depending  on 
the  Initial  conditions,  edges  can  occur  that  are  defined  by  the  line 
where  cells  begin  receiving  one  (A)  or  two  (B)  Inhibitions  froa  the 
other  surface. 


Cooperative  computation 


14 


Marr,  Palm  6 Poggio 


H 

two  reasons.  Firstly,  boundaries  cannot  be  everywhere  concave,  and 
secondly,  with  our  particular  excitatory  neighborhood  and  parameter 
values  (see  figs.  3d  h e)  the  amount  a concave  border  can  fill  in  is 
limited  to  at  most  two  elements.  Fig.  5 shows  the  two  possible  stable 
edges  for  flat  boundaries. 

3.2  Mo t all  invariant  configurations  satisfy  the  matching  rules 

Strictly  speaking,  the  converse  result  to  that  of  the  last  section 
is  not  true.  A counter-example  to  the  uniqueness  condition  that  is 
stable  with  our  parameters  appears  in  figure  6.  Interior  points  of  a 
plane,  wholly  surrounded  by  other  points  in  the  same  sheet,  can  survive 
inhibition  from  two  other  cells  and  so  can  boundary  points  where  the 
boundary  is  straight.  In  figure  6b,  points  of  these  two  types  are  the 
only  ones  that  occur.  A counter-example  to  the  continuity  condition 
appears  in  figure  7,  and  it  is  left  as  an  exercise  to  show  that  this 
pattern  is  invariant.  In  practice,  neither  of  these  configurations  can 
actually  develop  from  a random-dot  stereogram. 

When  the  input  consists  of  two  stereograms  portraying  a single 
surface,  the  probabilistic  analysis  of  the  next  section  shows  that  with 
high  probability,  the  solutions  will  in  fact  obey  the  uniqueness 
condition. 

If  the  input  stereograms  portray  a transparent  surface  in  front  of 


another  surface,  the  algorithm  with  our  parameter  values  will  usually 
fail  to  represent  the  input  accurately,  tending  Instead  to  develop  a 
solution  that  obeys  the  two  conditions  and  consists  of  a mosaic  of 


! 


6.  A stable  geometrical  configuration  that  violates  the  uniqueness 
condition  (6a).  The  central  square  consists  of  two  planes,  one  at 
disparity  2 and  one  at  disparity  0.  This  configuration  is  a stable 
state  of  the  algorithm,  in  the  sense  that  if  it  is  loaded  directly  into 
the  network,  an  invariant  configuration  is  quickly  reached  in  which 
both  planes  are  represented.  Figure  6b  demonstrates  this.  The 
stereogram  is  marked  Left  and  Right,  and  5 Iterations  of  the  algorithm 
are  shown.  If  the  network  is  loaded  in  the  usual  way,  however,  the 
algorithm  develops  a solution  that  is  a mosaic  of  patches  from  the  two 
levels  (6c). 
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patches  from  the  two  levels  (fig.  Gc).  With  the  parameters  we  chose, 
there  seems  to  be  no  convenient  and  precise  definition  of  the  stability 
of  configurations  that  forces  the  uniqueness  and  continuity  of 
solutions.  For  instance,  even  it  one  requires  in  addition  to 
invariance  some  kind  of  spatial  stability 2,  the  counter-example  of  fig. 

6 cannot  be  avoided,  although  a reasonable  "spatial  stability" 
condition  would  exclude  the  counter-example  of  fig.  7. 

If  one  could  exclude  significant  overlaps  between  surfaces  lying 
at  different  disparities,  it  appears  that  one  can  derive  the  continuity 
conditions  for  invariant  configurations.  The  argument  is  based  here  on 
the  notion  of  a hole3,  and  shows  by  straightforward  geometry  that  holes 
are  not  invariant. 

In  one  dimension  (in  which  the  network  consists  only  of  the  part 
shown  in  fig.  la)  the  problem  of  this  section  becomes  easier. 
Apparently,  the  only  way  of  reducing  the  2-dimensional  problem  to  a 
satisfactory  state  is  by  changing  the  parameter  values  (see  section  6). 

4 Probabilistic  analysis  of  the  algorithm 

We  have  been  unable  to  obtain  general  results  about  the 
convergence  of  this  type  of  algorithm.  Standard  approaches  --  e.g. 
Liapunov-type  methods  and  the  usual  fixed  point  theorems  --  apparently 
fall  in  this  situation  for  reasons  that  we  shall  mention  in  the 


discussion. 


7.  A stable  geometrical  configuration  that  violates  the  continuity 
condition.  At  each  of  two  disparity  values,  the  "on"  cells  fore  a 
checkerboard  pattern,  but  they  are  arranged  in  such  a way  that  neither 
level  can  fill  ln»  because  of  inhibition  froa  the  other. 
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The  probabilistic  analysis  given  here,  although  not  completely 
satisfactory,  nevertheless  provides  useful  Information  about  the 
algorithm’s  convergence  for  random-dot  stereograms.  Strictly  speaking 
its  application  is  restricted  to  inputs  with  a random  structure. 

The  idea  behind  our  analysis  is  that  the  cells  in  the  network  can 
be  divided  into  populations  on  which  the  excitatory  and  inhibitory 
Inferences  are  statistically  homogeneous  (c / Marr  1971).  Our  analysis 
is  very  specific  to  the  algorithm  of  eq.  1.1.1,  because  the  way  in 
which  the  cells  are  divided  Into  populations  depends  critically  on  the 
geometry  of  the  algorithm  and  on  our  o priori  knowledge  of  its 
invariant  state. 


4.1  Assumptions  and  notation 

The  algorithm  has  the  structure  shown  in  fig.  1 and  the  network  is 
loaded  from  the  input  as  specified  by  eq.  2.1.1.  We  shall  assume  that 
the  inputs  have  the  following  properties. 

4.1.1  the  1 * s in  each  image  occur  randomly  with  probability  v,  and  the 
autocorrelation  of  each  input  sequence  (for  any  given  p)  is  a Kronecker 

6. 


4.1.2  the  input  admits  a unique  solution  surface  that  is  large  enough 
to  neglect  boundary  effects. 
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Condition  4.1.2  means  that  the  left  input  is  equal  to  the  right  one, 
modulo  x-translation.  Condition  4.1.1  implies  that  in  the  initial 
state  of  the  network  C,  the  density  of  l's  on  the  solution  layer  equals 
v,  and  elsewhere  it  is  v2.  We  subdivide  the  cells  into  five 
populations,  by  classifying  them  in  two  ways: 

til  according  to  whether  or  not  they  are  a "on"  in  the  initial  state 
C^,  and 

(ii)  according  to  the  number  of  active  inputs  from  the  images. 

We  draw  both  the  populations  0 and  1 from  cells  that  lie  on  the 
solution  layer;  population  0 is  defined  to  receive  no  active  inputs 
from  the  image,  and  population  1 receives  two.  Notice  that  there  are 
no  cells  in  the  solution  layer  that  receive  exactly  one  active  input. 

The  other  three  populations  that  we  define  refer  to  cells  that  lie 
off  the  solution  layer;  population  11  receives  two  active  inputs  from 
t he  image,  population  10  receives  one,  population  00  receives  none. 

The  five  populations  (0,  1,  11,  10,  00)  are  exclusive  and  exhaustive. 

We  denote  by  p0(t),  pj(t),  etc.  the  probability  that  a cell  in  the 
respective  population  is  "on"  at  time  t.  The  goal  of  our  analysis  is 
to  express  the  values  of  the  pr( t)  in  terms  of  pK(t-l)  for  the  various 
populations  This  allows  us  to  examine  the  convergence  numerically, 
and  wo  say  that  a solution  is  achieved  at  time  T when 

P0<t)  * pt( t)  * 1,  and 

p00(  t)  * pl0( t)  * Pjj(t)  * 0,  for  every  t z T. 

The  critical  assumption  here  is  that  the  quantity  completely 

describes  the  structure  of  active  cells  in  the  respective  population 


W 

p»  
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w • This  assumption  is  true  for  the  initial  iteration  and  only 
approximate  thereafter.  We  shall  discuss  this  point  at  the  end  of  the 
sec  t ion. 

4.2  Formulae 

The  state  of  a cell  (x.y.d)  at  time  (t*l)  depends  upon  the  number 
of  active  cells  in  its  excitatory  S(x,y,d)  and  Inhibitory  O(x.y.d) 
neighborhoods  at  time  t. 

If  we  denote  the  populations  to  which  the  cell  belongs  by  n,  (ir 
running  through  the  five  populations  0,  1,  00,  11,  01),  let  us  define: 

1 I 

rn<r)  to  be  the  probability  that  exactly  r cells  are  "on"  in  the 
excitatory  neighborhood  S(x,y.d)  at  time  t and 

1 

iw(r)  to  be  the  probability  that  exactly  r cells  are  "on"  in  the 
inhibitory  neighborhood  0(x,u.d ) at  time  t. 

It  is  convenient  to  introduce  some  further  quantities: 

qs(t)  is  the  probability  that  a given  cell  on  the  "solution"  plane 
is  active  at  time  t. 

is  the  probability  that  a given  cell  elsewhere  in  the  network 
is  active. 
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q_(t)  is  the  probability  that  a given  cell  is  active  in  the 
inhibitory  neighborhood  of  a cell  in  the  population  0. 

q*(t)  is  the  probability  that  a given  cell  is  active  in  the 
inhibitory  neighborhood  of  a cell  in  the  population  1. 

Then: 

4.2. 1 <rs( t)  * p0(t).(i  - t>)  * pt(t).p 

<r„(t)  ' p0o< * p0i(t).2*(t  - v)  ♦ ptl(t).v2 

v.(t)  * p00(t).(l  - v)  * p01( t)  . V 

*+< * plt(t).¥  *■  p10(t).a  - *) 

Writing  B(n,  /»  m)  * nCn.f*(t  - where  mCH  is  the  binomial 

coefficient,  we  have  immediately 

4.2.2  ej(r)  x e0(r)  * B(r,  E) 

i,(r)  x B(r,  2D  - 2) 

\0<r)  * B(r.  qjt)\  2D  - 2) 

ejj(r)  x e00(r)  x e10(r)  x B(r,  q9(t)t  E' 

The  remaining  are  more  difficult  to  obtain,  since  the  inhibitory 
contributions  to  cells  lying  off  the  solution  plane  come  from  cells 
lying  on  the  solution  plane  and  from  cells  lying  off  the  solution 
plane,  and  these  two  populations  obey  different  statistics.  In  fact 
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I 

L 


4.2.3  iji(r)  * lp1(t)J2.B(r  - 2.  q+(t),  20  - 4)  * 

* 2p1(t).(l  - pt(t)).B(r  - l,  q+(t)i  20  - 4)  * 

* [1  - pl(t)]z.B(r . q+(t),  20  - 4) 


* 00<r> 


lPp(t)]2 .B(r  - 2,  qjt),  20  - 4)  *■ 

♦ 2p0(t).(l  - p0(t)).B(r  - 1.  q.( t),  20  - 4)  * 


♦ H - P0(t)]2  B(r,  qjt),  20  - 4) 


The  final  case  i10  is  especially  awkward,  because  along  one  cf  the 
inhibitory  lines  the  probability  of  a cell  being  "on"  is  q+  and  along 
the  other  diagonal  it  is  q_. 


4.2.4 


iJ0(r) 


X {pt(t)B(k  - 1.  q+(t),  0-2)  * 

(1  - p1(t))B(k,  qj t),  0-2 )}• 
{p0(t).B(r-k+l,  qjt),  0-2)  * 

(1  - p0(t))B(r-k,  qjt),  0 - 2)) 


We  now  need  to  relate  the  pJt*l)  and  the  pw(t)  in  terms  of  the  e_  and 


For  each  cell  population  we  know  the  distributions  of  incoming 


excitation  and  inhibition,  and  we  know  that  a cell  will  be  on  whenever 
the  excitations  exceed  the  inhibitions  by  at  least  9 . Hence 


4.2.5 


.t+1 


7. 

n m to  E 
m = 0 to  2D-2 
n - em  - 0 


ei  <"> 


j 


m 
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where  |f  / ( - i for  ir  • 1,  w • 11 


•w  '9 


otherwise 


If  the  input  term  c"  of  eq.  1.1.1  is  neglected,  9n  • 6 for  all  w. 

The  equations  4.2.6  are  too  complex  to  be  solved  analytically. 
Numerical  solutions  were  however  obtained  for  various  values  of  the 
parameters  and  some  of  the  results  are  given  in  table  1 and  figure  8. 

4.3  Range  oj  parameter  values  and  comparison  with  actual  runs 

Figure  8 exhibits  the  performance  of  the  algorithm  for  stereograms 
having  densities  of  from  0.5  to  0.05.  Table  1 gives  the  statistics 
that  were  measured  from  these  runs,  and  also  the  parameters  predicted 
by  the  probabilistic  theory.  The  values  obtained  from  the  theory  match 
those  from  the  algorithm  quite  well  for  the  first  iteration,  but  except 
for  the  case  v - 0.05,  they  diverge  quite  rapidly  thereafter,  and  even 
this  case  diverges  by  the  third  iteration. 

We  have  already  noted  the  main  reason  for  the  discrepancy.  The 
Assumption  that  the  statistical  structure  of  various  populations  is 
purely  random  (inside  each  population  and  between  populations)  holds 
exactly  for  the  first  Iteration  but  only  approximately  thereafter, 
because  the  operator  of  fig.  lc  has  a local  structure  which  can 
preserve  local  clusters  of  active  cells.  There  are  two  ways  in  which 
this  affects  our  probabilistic  description  for  the  second  and 
subsequent  iterations.  The  first  is  that  clusters  are  more  stable  than 


Algorithm 


CO  to  &. 
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the  assumption  of  randomness  would  predict.  Thus  clusters  forming  on 
the  s.olution  layer  will  in  certain  circumstances  change  the  rate  of 
convergence  predicted  by  the  randomness  assumption. 

The  difficulties  arise  where  clusters  form  off  the  solution  layer. 
These  will  again  tend  to  be  more  stable  than  our  analysis  assumes,  but 
their  effect  acts  against  convergence.  However,  we  shall  argue  that 
the  probability  of  large  "wrong"  clusters  is  small  for  most  patterns. 

In  fact,  the  typical  value  of  the  probability  that  a wrong  cell  is  "on" 
after  the  first  iteration  lies  around  0.1.  The  probability  (after  the 
first  iteration)  of  a self-supporting  3 x 3 cluster  at  a given 
position  in  a wrong  layer  (assuming  that  the  cluster  was  absent  in  the 
initial  state  and  accepting  the  oversimplified  assumption  of  randomness 
after  the  first  iteration)  is  about  10"^,  and  hence  less  than  10~4  that 
one  exists  off  the  solution  plane  somewhere  in  the  network. 

A cluster  of  this  size  may  survive  permanently,  because  every 
element  in  it  has  at  least  6 cells  in  its  excitatory  neighborhood,  and 
this  is  enough  to  resist  1 inhibition).  The  probability  of  this  or 
something  larger  arising  by  chance  is  so  small  that  if  it  occurs  it  is 
likely  to  be  a consequence  of  the  particular  image.  In  fact,  some 
small  "wrong"  patches  do  sometimes  occur  (inspect  Marr  $ Poggio  1976 
fig.  5d)  but  such  instances  can  usually  be  traced  to  an  accidental 
correlation  in  the  image.  In  this  sense,  extended  patches  are 
"correct"  solution  regions. 


The  second  effect  that  leads  to  discrepancies  between  the  theory 
and  the  behavior  of  the  algorithm  is  also  a side-effect  of  clustering, 
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<j  1 

since  as  well  as  being  stable,  the  clusters  tend  to  concentrate  ’'on” 
cells  more  than  the  randomness  assumption  would  predict.  For  example, 
at  iteration  2 of  the  case  v ■ 0. 25  (fig.  8b),  although  the  overall 
density  of  ones  on  the  solution  plane  is  about  0.39,  it  is  far  from 
true  that  each  cell  can  expect  to  find  0. 39E  "on"  cells  in  its 
excitatory  neighborhood.  Cells  in  the  filled  in  regions  have  almost 
all  their  neighbors  on,  whereas  those  in  the  interstices  have  none. 

Convergence  is  achieved  by  a growth  outwards  that  fills  in  the  blank 
regions,  but  although  it  is  steady,  it  is  necessarily  slower  than  the 
theory  predicts. 

5 Observations 

5.1  There  is  a wide  latitude  in  the  range  of  parameters  for  which  the 
network  converges.  Table  2 shows  firstly  the  wide  range  in  stereogram 
density  v that  is  tolerated  by  our  parameters  (with  fixed  9),  and 
secondly,  for  a fixed  value  of  v (v  * 0.5)  gives  some  idea  of  the  range 
of  the  other  parameter  values  for  which  the  network  will  converge. 

Note  that  in  the  implementation  described  by  Marr  8 Poggio  (1976),  the 
threshold  was  not  fixed,  but  was  determined  by  the  density  of  "on" 
cells  in  the  network.  This  allowed  solution  to  the  matching  problem 
over  a very  wide  range  of  dot  densities. 

5.?.  Let  us  define  the  probability  that  a cell  on  the  solution  layer  is 


TABLE  2.  The  algorithm  of  eq.  1.1.1  converges  for  a wide  range  of  control 
parameters.  Tables  2a,  b show  convergence  for  v * 0.5  and  v = 0.1  with 
the  same  parameters.  Table  2c  shows  convergence  for  an  entirely  different 
set  of  parameters. 
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"on"  at  time  t to  be 

pr(t)  * v.pj(t)  * (l  - I i)Pp(t) 

and  the  probability  that  a cell  off  the  solution  layer  is  on  at  time  t 

as 


P„(t)  pZ.pjjft)  * 2v(t  - v)p10(t ) * (1  - v2)p00. 

In  a successful  run,  pr  converges  to  1 and  pw  to  0.  With  our 
particular  parameters,  convergence  is  monotonic  if  it  occurs.  This  is 
not  true,  however,  for  the  individual  quantities  pt,  p0,  p pJ0,  p00, 
neither  is  it  true  of  pr  and  pu  for  all  values  of  the  parameters  (see 
table  2). 


5.3  We  have  already  seen  that  the  sharpest  local  corner  capable  of 
resisting  1 inhibitory  input  is  about  90°  or  more,  hence  thin,  sharp 
regions  will  tend  to  be  rounded  off  locally  (see  Marr  h Poggio  19  70 
fig.  Sc).  The  exact  shape  of  the  input  pattern  is  preserved  only  up  to 
this  limit. 

5.4  Minimum  site  vs.  disparitp 

A natural  consequence  of  the  structure  of  the  algorithm  is  that 
the  minimum  resolvable  area  of  a small  pattern  against  a background 
increases  with  disparity  (see  Marr  6 Poggio  1976,  fig.  6).  We  give  an 


9.  The  alnlaua  resolvable  area  of  a saall  pattern  against  a background 
increases  with  disparity.  To  prevent  the  background  froa  filling  In 
coapletely,  the  length  of  the  patch  in  the  x-directlon  aust  be  at  least 
d * 2 (9a).  9b  shows  the  circles  of  disasters  3,  5 and  7 used  in 
figure  6 of  Marr  6 Pogglo  1976. 
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estimate  of  the  dependence  of  minimum  patch  size  on  disparity 
difference.  Consider  a section  for  some  fixed  y of  the  network  (fig. 

9).  Assume  that  the  patch  and  background  regions  are  filled  in.  The 
condition  for  growth  at  a point  (x,  y,  d)  under  1 inhibition  is  that 
the  number  of  "on"  cells  in  an  excitatory  neighborhood  should  be  not 
less  than  $ * < - C°  * 5 or  6,  depending  on  the  Initial  conditions. 
From  fig.  3 we  see  that  flat  or  convex  regions  will  not  grow  whereas 
concave  regions  will.  Hence  our  small  patch  will  not  tend  to  grow, 
whereas  the  background  will  spread  until  stopped  by  two  inhibitions. 

Ue  see  from  fig.  9a  that  to  prevent  the  background  from  filling  in 
completely  (which  would  subsequently  destroy  the  patch  because  convex 
borders  cannot  survive  two  inhibitions),  the  length  of  the  patch  in  the 
x direction  must  be  at  least  d * 2.  This  condition  must  hold  for  at 
least  three  adjacent  lines  aligned  in  the  y direction.  Fig.  6 of  Marr 
$ Poggio  (1970)  illustrates  the  approximate  validity  of  this  relation. 
Fig.  9b  shows  the  sizes  of  circles  of  diameter  3,  5,  and  7 used  in  the 
input  for  that  figure.  These  precise  patterns  do  not  necessarily 
emerge  in  the  appropriate  layer  of  the  network  because  of  the  random 
nature  of  the  borders.  The  circle  of  diameter  3 contains  no  3 x 3 
subset  and  therefore  does  not  survive  at  any  disparities.  The  circle 
of  diameter  5 contains  one  3x3  square  and  survives  as  expected  at 
disparity  1;  it  also  survives,  apparently  accidentally,  at  disparity  2, 
but  not  at  disparity  3.  The  circle  of  diameter  7 contains  one  5x5 
square  and  thus  survives  at  disparity  3. 

A trivial  consequence  of  this  analysis  is  that  horizontal  stripes 
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(parallel  to  the  x axis)  are  in  general  more  stable  than  vertical  ones 
(parallel  to  the  y axis).  The  minimum  thickness  for  horizontal  stripes 
is  about  3 and  is  independent  of  disparity  whereas  the  minimum 
thickness  for  vertical  stripes  is  about  <f  *2  (see  fig.  10). 

5.6  Uncorrelated  areas 

If  there  exists  a sufficiently  large  area  in  the  input  where  there 
is  no  correlation  between  the  two  images,  the  network  will  detect  it 
(see  figs.  5 and  6 of  Science).  After  the  first  iteration  (with  our 
parameter  values  and  v » 0.  5)  only  a few  cells  remain  "on"  in  the 
uncorrelated  region,  but  provided  the  region  is  sufficiently  large  they 
will  receive  no  inhibition  from  the  surrounding  more  organized  layers. 
Hence  those  cells  that  are  on  may  act  as  germs  for  small  regions  that 
have  become  stable  by  the  time  the  surround  encroaches  upon  them,  e.g. 
fig.  5d  of  Marr  % Poggio  (1976).  Relatively  small  (<<  d)  uncorrelated 
areas  probably  have  to  develop  stable  platelets  to  survive  (see  fig.  6d 
of  Marr  6 Poggio  1976),  and  large  uncorrelated  areas  decompose  into  a 
random  mosaic  of  stable  platelets  (see  figure  11). 

Uncorrelated  areas  can  be  recognized  as  such  during  the  read-out 
from  the  network,  when  the  l's  that  appear  in  the  solution  found  by  the 
network  are  used  to  establish  an  explicit  correspondence  between  the 
two  images. 

6.6  Extension  to  images  in  which  disparity  varies  continuously 

The  algorithm  of  eq.  1.1.1  with  the  loading  rules  of  eq.  2.1.1  can 
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deal  only  with  images  having  discrete  disparity  values.  The  disparity 
in  natural  images  commonly  varies  continuously.  There  are  two 
approaches  to  this  problem.  One  is  to  Incorporate  the  representation 
of  continuous  values  directly  into  the  algorithm,  and  the  other  is  to 
use  the  same  algorithm,  but  with  special  rules  for  loading  it  and  for 
interpreting  its  final  state. 

The  first  approach  would  clearly  lead  to  a considerably  different 
algorithm,  perhaps  more  along  the  lines  of  the  networks  studied  by 
Wilson  6 Cowan  (1973),  (see  also  H.  R.  Wilson  1977).  Such  an  algorithm 
could  not  be  treated  within  the  framework  of  this  article. 

The  second  approach  does  not  require  any  changes  in  the  analysis 
of  the  algorithm  itself.  One  could,  for  exampe,  define  the  loading 
conditions  as  follows: 

Let  A be  the  disparity  attached  to  a possible  correspondence 
between  items  in  the  left  and  right  images.  For  integral  d, 

5.6.1  If  cr-f)£A<d  + i),  load  the  cell  corresponding  to  disparity 
level  d in  the  network. 

For  surfaces  whose  disparity  does  not  oscillate  too  much  or  too 
densely,  the  value  ij  • 0.  5 will  lead  to  satisfactory  results.  The 
final  state  of  the  network  establishes  a correspondence  between  items 
in  the  left  and  right  images,  but  their  associated  disparity  is  read 
not  from  the  network  (f.e.  d)  but  directly  from  the  input  (t.e.  A). 


Confusions  may  of  course  arise  in  the  correspondence  established  by  the 
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network  if  the  value  of  d spans  the  disparity  range  too  coarsely. 

In  order  to  deal  with  surfaces  that  are  less  well-behaved,  one  can 
incorporate  some  hysteresis  into  the  loading  rules.  The  loading 
process  then  consists  of  the  following  steps: 

5.6.2  Load  cells  according  to  6.6.1  with  tj  = 0.  3 (say). 

5.6.3  Moving  across  the  image  (x,  y)  in  a spatially  ordered  way,  if 
possible  match  (x.  y,  A)  was  not  loaded  by  5.6.2,  adopt  the  following 
procedure: 

Let  d~  = Integral  part  of  A,  d*  * 1 + d~.  Examine  (x.  y) 
neighborhoods  of  (x,  y,  d~)  and  of  (x,  y,  d *)  in  the  network  as  it 
is  loaded  so  far.  Assign  the  current  match  to  that  d whose 
neighborhood  contains  more  loaded  cells,  if  one  of  them  does.  Else 
load  this  point  according  to  6.6.1  with  ri  = 0.  5. 

This  process  will  load  most  Images  In  satisfactory  way,  and  the 
read-out  procedure  is  similar  to  that  of  the  previous  case. 


6 A mathematically  tractable  version  of  the  algorithm 

A suitable  choice  of  the  parameter  values  and  of  the  loading  rules 
of  the  algorithm  allows  a complete  mathematical  analysis  of  its 
asymptotic  behavior.  In  this  section  we  introduce  this  "strict" 
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version  of  the  algorithm  and  we  characterize  rigorously  its  properties. 
The  actual  performance  of  this  version  of  the  algorithm  for  various 
random  dot  stereograms  will  be  then  compared  with  the  original 
algorithm. 


6.1  Loading  conditions 

The  initial  state  C®  of  the  network  is  loaded  from  the  stereograms 
L,  n in  a way  similar  to  the  previous  case  but  according  to  eq.  6.1.1 
(instead  of  eq.  2.1.1). 

6.1.1  cx.y,d  ' Lx.y^x*d,y 

where  1.  1 = 0.  0 = 1,  1.  0 = 0. 1 * 0. 

This  loading  rule  can  be  easily  extended  to  cases  in  which  more  than 
two  features  are  present.  It  is  enough  to  define 

6.1.2  f i -f  j " &i jt 

where  fi  and  fj.  (i  * j ) are  two  different  features.  The  case  when 
only  two  features  are  present  clearly  poses  the  hardest  matching 
problem.  We  shall  later  compare  this  loading  rule  with  the  original 
eq.  2.1.1  and  discuss  their  relative  merits  for  real  Images. 
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6.2  The  algorithm 

The  relation  betwen  states  at  times  t and  t+i  is  given  by  (compare 
eq.  l.l.l) 


6.2.2 


t 1 i , 

X 


£ ‘v.y’-.d' 

,y' ,d'eS(x,y,d) 


e I Cx' ,y‘ ;d 1 
x,,y\d,E0(x,y,d) 


where  H is  a number  that  represents  the  "saturation"  value  for  the 
excitation. 


6.3  Choice  of  parameter  values 

In  this  case  the  loading  rules  lead,  for  random  dot  stereograms 
with  two  features,  to  a density  of  1 for  the  "on"  cells  on  the 
"correct"  diagonal  segments  and,  correspondingly,  to  a density  of 
v2  * (1  - v)2  for  the  "on"  cells  on  the  "wrong"  diagonal  segments  (k  is 
the  density  of  I's  in  the  input  images).  When  v = 0.5,  the  density  of 
the  wrong  cells  is  also  0.5;  for  smaller  or  larger  v the  density  is 
higher.  The  idea  behind  this  approach  is  to  choose  parameter  values 
for  the  first  iteration  that  "kill"  most  of  the  "wrong"  cells  (and  of 
course  some  of  the  "right"  ones);  from  the  second  iteration  on,  the 
parameter  values  are  such  to  ensure  "filling-in"  of  the  right  diagonal 
segments,  allowing,  at  the  same  time,  a satisfactory  mathematical 
analysis  of  the  evolution  of  the  network's  state.  This  approach,  which 
is  carried  out  in  the  next  two  sections,  leads  to  the  following 
parameter  values: 
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6.3.1  S and  0 are  as  in  fig.  2,  D « 7 and  H ■ 5 as  before.  Self- 
excitation is  now  included  but  the  C°  term  is  omitted.  We  therefore 
write  E ■ 13  instead  of  12. 

6.3.2  Iteration  h 

H * 13  (so  that  the  inf  operation  can  be  neglected) 

< e 0.  2 

e = 10.  75 

6.3.3  Second  and  subsequent  iterations , 

H = 7 
« » 4.  0 
$ - 3.  5 


6.4  Probabilistic  analysis  of  the  first  iteration 

We  shall  assume  that  the  inputs  have  the  properties  4.1.1  and 
4.1.2.  As  in  section  4.1,  we  distinguish  several  populations  of  cells 
which  are  homogeneous  with  respect  to  the  interaction  structure:  the 
populations  are  again  denoted  by  0,  1,  11,  10,  00  according  to  their 
respective  Inputs  from  the  two  images  (see  section  2),  and  p0,  pt,  etc. 
denote  the  probability  that  a cell  in  the  respective  population  is  "on" 
after  the  first  iteration.  In  this  case  the  formulae  for  the  solution 
layer  are: 
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13-e 


Pi  • ^ i2ct  ~ *) 

1 i=o 


12-1 


13-0 


Pp  £ I2ci'(i  ~ r)1-*12'1 
i=0 

For  the  "wrong"  layers  (writing  p ■ ♦ ('l  - the  formulae  are 

k+1-e  _2 

' I ,A  “k(’-)12'k  i 10S  ^o-v)10-1 

k=0  i=0 

k+1-0  o 


1? 


p00 


■-  c kn  J2-k  i.  r n vl  10-1 

■r  I 12Ck  w £ 10Ck' ' ' v 


k=0 


12 


inf  (--2), 5 inf  -2)-i,5 

£ C 

i / 1 \5-i . V r / 1 \ j,  5-j 


PlO  * ^ 1 2Ckpk^ 1 _VJ ^ ^ 5Civ  ^ 5Cj^"v^ 

k=0  i*0  j=0 


Therefore  the  probability  that  a cell  in  the  solution  layer  is 
"on"  after  the  first  step  is 
pr  ' pj.v  * Po-<l  - v) 

and  the  probability  that  a cell  off  the  solution  layer  is  "on"  after 
the  first  step  is 

P»  * *2pii  * “ ¥*p10  + <*  ~ *’>2P00* 

These  equations  can  be  used  to  find  suitable  parameter  values.  The 
parameters  given  in  the  previous  section  yield  the  values  for  pr  and  pw 
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shown  in  table  3. 

I 

1 

6.5  Equivalent  rules 

1 

The  parameter  values  from  the  second  iteration  on  imply  the 
following  main  "rules"  for  the  algorithm: 

6.5.1  one  "on"  cell  in  the  inhibitory  neighborhood  always  suffices 
kill  an  "on"  cell 

6.5.2  without  inhibition,  at  least  three  excitatory  "on"  cells  are 
needed  for  "survival"  of  an  "on"  cell  and  four  for  its  "birth". 


6.6  dnaltrsis  of  the  second  iteration 

Table  3 gives  the  densities  pr  and  pu  after  the  first  iteration. 
Only  for  the  first  iteration  can  a probabilistic  analysis  provide  a 
reliable  estimate  of  the  density  of  "on"  cells  on  the  solution  surface. 
As  in  our  earlier  analysis  (table  1),  it  becomes  unreliable  for  the 
second  iteration,  because  clusters  of  "on"  cells  can  be  expected  to 
form  in  the  solution  layer  (see  figure  14  below).  Rule  6.5.1  implies, 
however,  that  "wrong"  clusters  will  disappear  after  the  second 
iteration,  unless  they  consist  of  at  least  four  elements.  Moreover, 
these  elements  must  in  practise  be  very  close  together  for  each  xo 
support  the  other  three.  In  addition,  according  to  rule  6.6.2,  none  of 
them  can  lie  in  the  inhibitory  neighborhood  of  other  "on"  cells  (for 


TABLE  3.  The  behavior  of  the  mathematically  tractable  of  the  algorithm, 
together  with  the  probabilistic  theory  of  the  first  iteration, 
for  the  two  stereograms  exhibited  in  figures  14a  and  b. 
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instance  on  the  solution  layer  where  the  density  of  on  cells  is 
relatively  high,  see  table  3).  We  argue  that  the  probability  of  such 
situations  is  very  small  (actually  much  smaller  than  in  the  case 
considered  in  section  4.4).  If  this  occurs  it  can  be  attributed  to  an 
accidental  correlation  in  the  images.  In  this  sense  extended  clusters 
are  in  fact  "right"  solution  regions. 


6.7  Asymptotic  analysis 

The  probabilistic  analysis  of  the  first  iteration  (table  3)  shows 
that  one  can  assume  that,  from  the  second  iteration  onwards,  there  are 
no  wrong  "on"  cells.  It  remains  now  to  show  that  the  density  of  "on" 
cells  on  the  solution  layer  is  high  enough  to  allow  asymptotic  filling- 
in  of  the  "right"  surfaces.  We  prove  the  following: 

6.7.1  Filling- in  Proposition.  Assume  that  (at  some  iteration  n) 
there  are  no  "on"  cells  off  a given  layer  (diagonal),  and  that  the 
density  of  "on"  cells  on  this  layer  exceeds  0.4375  - 7/16.  Then, 
in  the  asymptotic  configuration,  there  are  no  "off"  cells  on  this 
layer. 

Proof:  Divide  the  solution  plane  into  squares  of  4 by  4 cells  (we 
neglect  boundaries).  At  least  one  of  these  squares  must  contain  8 "on" 
cells,  for,  otherwise,  every  square  would  contain  at  most  7 "on"  cells 
yielding  a density  of  at  most  7/16,  in  contradiction  with  the 
hypothesis.  This  square  will  fill  up  with  "on"  cells.  (This  can  be 


! 


seen  by  examining  the  various  possible  ways  In  which  the  8 cells  can  be 
distributed,  and  we  leave  it  as  an  exercise  for  the  reader).  Starting 
from  this  square,  the  whole  plane  will  asymptotically  be  filled  by  "on" 
cells  (since,  by  hypothesis,  no  inhibitory  cells  need  be  considered). 

6.8  Invariant  states  and  matching  rules 

The  matching  rules  were  defined  in  section  3.  States  that  satisfy 
the  matching  rules  with  the  present  parameter  values  are  shown  in  fig. 
12.  In  view  of  the  rules  6.5.1  and  6.5.2,  the  following  clearly  hold: 

i)  Configurations  that  satisfy  the  matching  rules  (fig.  12)  are 
invariant. 

ii)  Conversely,  invariant  configurations  clearly  have  to  obey  the 
uniqueness  condition  (because  of  6.6.1).  The  probabilistic  analysis  of 
the  second  step,  together  with  the  "filling-in"  proposition  6.7.1, 
ensures  in  practise  that  there  will  be  no  holes^  In  the  asymptotic 
invariant  configurations. 

6.9  Asymptotic  Liapunov  description 

Besides  the  invariant  asymptotic  configuration,  limit  cycles  of 
the  type  described  in  figure  13  may  also  occur.  Thus  the  previous 
description  of  asymptotic  invariant  states  Is  nut  complete.  We  provide 
here  an  asymptotic  analysis  in  terms  of  a Liapunov-like  function  which 


also  encompasses  such  non-invariant  states. 

For  a given  state  C*,  we  define  F(C')  to  be  the  number  of  "on" 
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cells  having  no  "on"  cells  in  their  inhibitory  neighborhood.  Ke  call 
an  "on"  cell  that  has  less  than  three  "on"  cells  in  its  excitatory 
neighborhood  a "solitary  cell".  Observe  that  solitary  cells  can  never 
be  "born"  and  that,  after  a finite  number  of  Iterations,  all  solitary 
cells  will  have  disappeared. 

6.9.1  Growth  proposition.  After  a finite  number  of  iterations,  the 
function  i — > F(C i)  is  non-decreasing. 

Froof-.  After  a finite  number  of  iterations  t,  all  solitary  cells  have 
died  out.  Let  us  consider  the  transition  from  C*  to  C***.  If  a new 
cell  is  born,  rule  6.6.1  implies  that  it  cannot  lie  in  the  inhibitory 
neighborhood  of  an  already  present  "on"  cell.  Thus  F will  not  decrease 
(from  C*  to  If  a cell  dies  out,  it  cannot  be  a solitary  cell. 

Therefore  it  must  have  had  an  "on"  cell  in  its  inhibitory  neighborhood 
at  iteration  i.  Thus  F will  not  decrease. 

The  growth  of  F adequately  describes  the  filling-in  process, 
respecting  at  the  same  time  the  "uniqueness"  matching  rule. 

The  growth  proposition  implies  that: 

6.9.2  For  any  initial  configuration  C°,  the  limit  Urn  F(C i)  * F( C) 
exists  (since  F is  bounded  above  by  the  number  of  cells  in  one  layer). 

6.9.3  After  a finite  number  of  iterations,  Fit1)  remains  constant. 


Thus  the  asymptotic  behavior  of  the  system  is  characterized  by  the 
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following:  Apart  from  Invariant  solutions,  only  those  cycles  can 
(asymptotically)  occur  for  which  F remains  constant.  This  is  a strong 
restriction  on  the  possible  asymptotic  oscillatory  states,  for  it  means 
that  they  have  to  be  of  the  type  shown  in  fig.  13.  The  growth 
proposition  by  itself  does  not  exclude  for  Instance  the  "zero” 
invariant  state.  The  probabilistic  analysis  of  the  second  step  of  the 
algorithm  together  with  the  "filling-in"  proposition  ensures,  however, 
that  invariant  states  as  well  as  limit  cycles  will  in  practice  have  no 
"holes". 


6.10  Observations 

6.10.1  Figure  14  shows  the  performance  of  the  algorithm  in  this  form 
for  a few  different  patterns  and  pattern  densities.  A comparison  with 
figure  8 reveals  that  the  type  of  "strategy"  for  achieving  a successful 
matching  is  different:  Firstly,  wrong  cells  are  drastically  eliminated 
at  the  expense  of  losing  many  right  cells,  and  then  filllng-ln  of  the 
surviving  surfaces  takes  place.  This  contrasts  with  the  more 
complicated  "strategy"  revealed  in  figs.  3 - 6 of  Marr  h Poggio  1976. 

It  is  remarkable  how,  while  the  basic  structure  of  the  algorithm 
remains  the  same,  a change  of  parameter  values  and  loading  conditions 
can  bring  about  so  deep  a change  in  the  algorithm's  behavior. 


6.10.2  Because  of  the  rules  of  the  present  algorithm,  especially  rule 
6.6.1,  the  sharpest  corner  capable  of  surviving  (of  course  under  no 
inhibitions)  is  limited  only  by  the  need  for  an  "on"  cell  to  have  at 


14.  The  behavior  of  the  algorithm  with  modified  parameters.  The 
densities  are  50t  (14a)  and  251  (14b).  The  parameters  are  as  stated  in 
section  6.3,  and  In  table  3.  14c  compares  the  two  sets  of  parameters 

on  a stereogram  of  a star  that  contains  arms  of  various  angles.  The 
original  parameters  tend  to  give  a more  accurate  final  configuration. 
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least  three  excitatory  neighboring  cells.  This  allows  a 45  degree 
corner. 

6.10.3  Minimum  size  us.  disparity.  Again  because  of  rule  6.6.1,  the 
minimum  resolvable  area  of  a small  pattern  against  a background  does 
not  depend  on  disparity.  It  is  given  by  the  minimum  self-supporting 
configuration  (four  adjacent  cells,  from  rule  6.5.2). 

This  contrasts  sharply  with  the  property  discussed  in  section  6.4, 
where  the  minimum  size  has  a characteristic  dependence  on  disparity. 

6.10.4  Loading  conditions . While  the  present  loading  rule  is 
characterized  by 

6.10.4.1  J^j  * 6ij,  where  is  a feature  and  hij  is  the 

Kronecker  6, 

the  previous  loading  rule  (section  2)  can  also  be  characterized  by 
equation  6.1.2,  with  the  convention  that  6ij  is  also  zero  when  either  i 
or  J are  zero.  In  other  words  the  "null"  feature  has  a special  status 

V0fj  -•  Jjfo  °'  311 

In  case  of  two-valued  (0,1)  random  dot  stereograms,  the  choice  of 
either  one  of  the  two  loading  rules  is  somewhat  arbitrary.  For 
densities  around  0.5,  the  straight  equation  6.10.4.1  seems  to  make  more 
sense,  since  the  black  and  white  dots  play  equivalent  roles.  This  is 
not  clear,  however,  at  very  low  densities  (nor  at  very  high  ones). 

In  the  case  of  natural  images,  more  than  two  feature  types  have  to 
be  used  (for  Instance,  lines  and  edges  at  various  orientations).  In 
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this  case,  however,  not  every  point  is  labelled  with  a corresponding 
feature;  the  absence  of  any  feature  at  a given  point  is  a common  event. 
The  null  feature  seems  to  have  a basically  different  role  from  the 
other  features.  These  arguments  clearly  support  the  loading  conditions 
used  in  the  first  part  of  the  paper  (see  Marr  8 Poggio  1976).  It  is 
clear,  on  the  other  hand,  that  both  loading  conditions  may  work.  For 
both,  an  increasing  number  of  "feature  types"  implies  of  course  an 
increasingly  better  algorithm  convergence.  The  choice  between  them 
depends  in  the  end  on  the  typical  feature  densities  that  one  wants  to 
deal  with.  For  natural  images,  quantitative  estimates  have  only 
recently  become  possible  (Marr  1976,  1977). 

7 Discussion 

7.1  Alternative  algorithms 

The  algorithm  eq.  1.1.1  can  be  modified  in  various  ways.  One  can 
adopt  alternative  loading  rules  for  the  network  as  in  section  6,  and 
one  can  vary  the  parameters  over  a substantial  range.  Such  apparently 
minor  changes  can  cause  considerable  changes  in  the  network's  behavior, 
but  often  without  changing  the  end  result  (see  for  instance  section  6), 
because  they  still  Implement  the  same  computational  constraints. 

If  the  geometry  of  the  local  interactions  (i.e.  the  shape  of  the 
excitatory  and  inhibitory  neighborhoods)  is  changed,  the  network  will 
in  general  Implement  a different  computation,  because  the  local 
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constraints  will  have  changed.  If  only  the  parameter  values  are 
changed,  our  analysis  (section  3)  may  still  apply.  If  the  geometry  is 
changed,  our  analysis  will  in  general  become  Irrelevant. 

Interestingly,  for  a specific  stereogram  density,  a non-iterative 
version  of  our  algorithm  can  recover  disparity  satisfactorily  (see  fig. 
14a  iteration  1).  John  Fairfield  (personal  communication)  suggested  an 
algorithm  in  which  (1)  excitation  is  summed  independently  within  each 
disparity  layer,  and  (2)  for  each  position,  one  selects  only  the  most 
excited  of  the  cells  in  the  different  disparity  layers.  This  algorithm 
performs  well  for  the  case  v «=  0.  5. 


7.2  Comments  on  analyzing  such  operations 

We  find  the  style  of  analysis  that  we  were  forced  to  adopt  to  be 
unsatisfactory  for  a number  of  reasons.  Firstly,  although  our 
arguments  appear  to  provide  a qualitatively  accurate  description  of  the 
algorithms's  behavior,  the  arguments  are  not  completely  rigorous.  The 
main  reasons  for  this  lie  in  the  difficulty  of  assessing  the  validity 
of  the  randomness  assumptions  that  are  necessary  for  the  probabilistic 
analysis;  and,  to  a lesser  extent,  in  the  need  to  examine  a number  of 
special  cases  in  order  to  establish  the  stability  of  various  solutions. 

Secondly,  our  analysis  is  very  specific  to  the  particular 
algorithm  and  the  particular  parameters.  This  style  of  proof  cannot 


lead  to  any  general  results  about  the  convergence  of  such  operators. 
In  order  to  overcome  the  first  of  these  problems  one  can  follow 
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the  approach  of  section  6.  The  price  one  pays  is  that  the  analysis  is 
valid  for  a narrower  parameter  range,  which  happens  not  to  include  the 
original  parameter  values  (see  Marr  6 Poggio  1976).  The  difficulty 
with  the  assumption  of  randomness  arises  because  of  the  constant 
spatial  structure  of  the  operator  E (eq.  (1)  and  fig.  2c  of  Marr  6 
Poggio  1976).  It  should  perhaps  be  noted  that  this  objection  does  not 
apply  to  the  similar  analysis  given  by  Marr  (1971)  of  a cooperative 
associative  memory  algorithr  because  there  the  local  operator  had  a 
variable  and  essentially  random  structure. 

The  second  of  these  difficulties  seems  to  be  inherent  in  the 


nature  of  this  type  of  cooperative  algorithm.  No  general  approach  is 
at  present  available.  Standard  approaches4  that  we  have  tried  have 
failed  up  to  now.  The  flavor  of  the  difficulties  is  the  following.  A 
configuration  that  is  stable  may  be  perturbed  by  changing  a large 
number  of  cells  without  affecting  its  asymptotic  state,  provided  that 
the  perturbed  cells  are  well  scattered  and  interior.  On  the  other 
hand,  one  fixed  point  of  the  algorithm  can  be  shifted  into  another  by 
perturbing  only  a few  cells,  provided  that  they  have  a suitable 
configuration.  Thus,  the  usual  distance  between  two  configurations, 
namely  the  number  of  cells  having  different  states,  does  not  reflect 
the  behavior  of  the  algorithm.  Therefore,  the  problem  seems  to  be  how 
to  incorporate  the  geometry  of  the  interactions  into  the  metric 
distance  between  configurations. 

It  seems  unlikely  that  one  can  construct  a useful  general  theory 
of  algorithms  of  the  form 
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7.2.1  C"*1  * 9(L(C*», 

where  t Is  a linear  operator  on  the  vector  C,  and  r Is  i nonlinear 
(coordinate-wise)  function.  J.  H.  Conway's  game  "Life"  can,  for 
example,  be  written  this  way  (see  fig.  15)  and  with  an  appropriate 
input  pattern  is  Turing  universal  (unpublished  result  discovered 
independently  by  J.  H.  Conway  and  R.  W.  Gosper). 

This  suggests  that  theories  of  this  type  of  algorithm  must  take 
due  account  of  the  structure  of  the  input  data  and  will  probably  be 
restricted  to  very  specific  forms  of  eq.  7.2.1. 

A mathematical  understanding  of  the  behavior  of  eq.  7.2.1  would 
represent  a breakthrough  of  rather  general  importance.  Cooperative 
phenomena  similar  to  those  which  can  be  described  by  eq.  7.2.1  are 
important  in  physics  (Haken  1977,  Kawasaki  1972,  K.  G.  Wilson  1975),  in 
development  (Mostow  1975),  and  in  biology  (Eigen  1971,  Marr  1971, 
Richter  1976). 

Furthermore,  such  a theory  might  also  allow  one  to  synthesize  in  a 
standard  way  cooperative  algorithms  of  the  form  of  eq.  7.2.1  from  an 
analysis  of  the  constraints  on  a computation. 

Acknowledgements:  We  thank  K.  P.  Hadeler  and  W.  Relchardt  for 
Interesting  discussions,  Karen  Prendergast  for  preparing  the 
illustrations,  and  the  Mathlab  Group  for  allowing  us  to  run  the 
probabilistic  theory  irt  MACSYMA,  the  M.  I.T.  symbolic  algebraic 


r 


■1 


J 


I 


IS.  Conway's  game  "life,”  which  is  played  on  an  Infinite  plane  square 
lattice,  may  be  represented  in  a aanner  very  similar  to  that  of  our 
stereo  operator.  The  excitatory  neighborhood,  together  with 
appropriate  weights,  is  shown  in  ISa,  and  the  threshold  function 
appears  in  ISb.  This  combination  reproduces  the  rules  of  life  exactly, 
and  these  are; 

(1)  A cell  will  die  at  generation  *♦!  if  < 2 or  > 3 of  its  8 neighbors 
are  alive  at  generation  a (death  by  starvation  or  overfeeding). 

(2)  A cell  with  exactly  2 living  neighbors  at  generation  a will  be 
alive  at  generation  n+i  it  and  only  If  It  Is  alive  at  generation  a. 

(3)  A cell  with  exactly  3 living  neighbors  at  generation  a will  be 
alive  at  generation  a+i. 
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Footnotes 

1.  In  Marr  6 Poggio  (1976),  the  value  of  0 was  given  as  3.0,  whereas 
here  it  is  4.0.  The  reason  for  the  discrepancy  is  that  the  algorithm 
used  to  produce  the  stereograms  for  that  article  essentially  used  the 
condition  > 6,  whereas  here,  we  use  the  condition  * 0. 

2.  A configuration  is  "spatially  stable"  if  it  is  in  some  sense 
invariant  under  small  perturbations  (for  Instance  each  active  point  can 
be  required  to  belong  to  a 3 x 3 neighborhood  of  points  with  the  same 
disparity). 

3.  There  is  a hole  in  the  network  for  a given  y if  there  exist  two 
intersecting  lines  of  sight  neither  of  which  contains  an  "on"  cell. 


: 


4.  The  continuous  version  of  the  algorithm  eq.  1. 1. 1 cannot  be 
described  in  terms  of  a potential  dynamics.  In  fact  the  dynamical 
system 


e 


C --  ofZC}  - C --  f(C)  <9  a "smooth"  threshold) 
does  not  admit  a scalar  potential  function  V(C)  such  that 
lS(C)ix  * ai/fo/dcX(tf>d. 

A necessary  condition  for  this  to  be  true  is  that 


3Cx  v-d  fxVd'^  = »c  . a11  x’y»d’x’ ‘d' 

* 9j  a ,y  »u 

This  is  not  true  in  general,  because  of  the  nonlinearity  9 (consider 
the  case  in  which  xyd  and  x'y'd’  are  on  the  same  disparity  layer  and 
are  reciprocally  excitatory). 
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